Shannon’s Entropy of The Stochastic Context-Free Grammar and an Application to RNA Secondary Structure Modeling

Author

  • Amirhossein Manzourolajdad
Abstract

Stochastic context-free grammars (SCFGs) have been used in RNA secondary structure modeling. An SCFG consists of a set of grammar rules, each with an associated probability. Given a grammar design, finding the set of probabilities that yields optimum performance can be challenging. Although current Expectation Maximization (EM) Maximum Likelihood (ML)-based model training approaches have been effective, there is no guarantee that the parameter sets they provide give the grammar optimum performance. In this work, an analytical measure of the SCFG space, denoted here as Grammar Space (GS) entropy, is introduced and calculated for various SCFG models in the literature. It is shown that more accurate models have lower GS entropy. Finally, based on the GS entropy, a novel RNA structure model training method is proposed.
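
To make the idea of "a set of grammar rules with probabilities" and an entropy over the grammar concrete, the following is a minimal sketch. It computes the standard derivational (Shannon) entropy of a probabilistic grammar, which may differ from the GS entropy defined in the paper; the abstract does not give that definition. The toy grammar, the names `TOY_GRAMMAR` and `derivational_entropy`, and the fixed-point solver are all assumptions made for illustration.

```python
import math

# Hypothetical toy grammar, NOT taken from the paper.
# Each nonterminal maps to a list of (probability, right-hand side) rules.
# Any symbol that is a key of the dict is a nonterminal; the rest are terminals.
TOY_GRAMMAR = {
    "S": [
        (0.3, ["(", "S", ")", "S"]),  # a base pair enclosing a substructure
        (0.3, [".", "S"]),            # an unpaired base
        (0.4, []),                    # end of derivation
    ],
}


def derivational_entropy(grammar, start="S", iterations=10_000, tol=1e-12):
    """Entropy (in bits) of the distribution over derivations from `start`.

    Solves H = h + M H by fixed-point iteration, where h[A] is the entropy of
    the rule choice at nonterminal A and M[A][B] is the expected number of B's
    produced by one rewrite of A. Converges only for subcritical grammars.
    """
    for nt, rules in grammar.items():
        total = sum(p for p, _ in rules)
        if abs(total - 1.0) > 1e-9:
            raise ValueError(f"rule probabilities for {nt} sum to {total}")

    # Entropy of the rule choice at each nonterminal.
    h = {
        nt: -sum(p * math.log2(p) for p, _ in rules if p > 0)
        for nt, rules in grammar.items()
    }

    H = dict.fromkeys(grammar, 0.0)
    for _ in range(iterations):
        new_H = {
            nt: h[nt] + sum(
                p * sum(H[sym] for sym in rhs if sym in grammar)
                for p, rhs in rules
            )
            for nt, rules in grammar.items()
        }
        delta = max(abs(new_H[nt] - H[nt]) for nt in grammar)
        H = new_H
        if delta < tol:
            return H[start]
    raise RuntimeError("no convergence; the grammar may not be subcritical")


if __name__ == "__main__":
    print(derivational_entropy(TOY_GRAMMAR))
```

Under this definition, changing the rule probabilities changes the entropy, so different parameter sets for the same grammar design can be compared by a single number. Whether that number behaves like the paper's GS entropy is not something the abstract settles.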


Similar resources

Stochastic k-Tree Grammar and Its Application in Biomolecular Structure Modeling

Stochastic context-free grammar (SCFG) has been successful in modeling biomolecular structures, typically RNA secondary structure, for statistical analysis and structure prediction. Context-free grammar rules specify parallel and nested co-occurrences of terminals, and thus are ideal for modeling nucleotide canonical base pairs that constitute the RNA secondary structure. Stochastic grammars h...


Stochastic Context-Free Grammars and RNA Secondary Structure Prediction

This thesis focuses on the prediction of RNA secondary structure using stochastic context-free grammars (SCFG). The RNA secondary structure prediction problem consists of predicting a 2-dimensional structure from a 1-dimensional nucleotide sequence. The theory behind SCFG is explained and an overview of the research literature on various methods in the field of secondary structure prediction is g...


RNA Modeling Using Gibbs Sampling and Stochastic Context Free Grammars

A new method of discovering the common secondary structure of a family of homologous RNA sequences using Gibbs sampling and stochastic context-free grammars is proposed. Given an unaligned set of sequences, a Gibbs sampling step simultaneously estimates the secondary structure of each sequence and a set of statistical parameters describing the common secondary structure of the set as a whole. T...


An evolutionary algorithm for stochastic context-free grammar design, with applications to RNA secondary structure prediction

Stochastic Context-Free Grammars (SCFGs) have been used widely in modelling RNA secondary structure. They were motivated by the use of Hidden Markov Models (HMMs) in protein modelling (Krogh et al., 1993). What HMMs lacked, though, was the ability to model long-range interactions, which are necessary to provide an effective model for RNA secondary structure. Thus, SCFGs, as generalisati...


Stochastic modeling of RNA pseudoknotted structures: a grammatical approach

MOTIVATION: Modeling RNA pseudoknotted structures remains challenging. Methods have previously been developed to model RNA stem-loops successfully using stochastic context-free grammars (SCFG) adapted from computational linguistics; however, the additional complexity of pseudoknots has made modeling them more difficult. Formally, a context-sensitive grammar is required, which would impose a large...




Publication date: 2015